Abstract

We investigate the possibility of extending some results of Pazman and Pronzato
(2014) to a larger set of optimality criteria. Namely, in a linear regression model the
problems of computing D-, A-, and $E_k$-optimal designs, of combining these optimality
criteria, and the “criterion robust” problem of Harman (2004) are reformulated
here as “infinite-dimensional” linear programming problems. Approximate optimum
designs can then be computed by a modified cutting-plane method, and this is
checked on examples. Finally, the expressions for these criteria are reformulated in
terms of the response function of a model that may even be nonlinear.

Keywords: Regression models, optimality criteria, concave maximization, cutting-plane
method, criterion-robust design.

1 Introduction 

We consider a regression model

$$y_i = \eta(x_i, \theta) + \varepsilon_i, \qquad i = 1, \dots, N,$$

where $y_i$ are observed variables and $\varepsilon_i$ are observation errors, which satisfy $E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = \sigma^2$, and $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$; $\sigma^2$ is not supposed to be known. The value of $\theta$ is a priori restricted to a parameter space $\Theta$. In vector notation the model is

$$y = \eta_X(\theta) + \varepsilon, \qquad E(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \sigma^2 I.$$

Here $X = (x_1, \dots, x_N)$ is the exact design with points $x_i \in \mathcal{X}$, $y = (y_1, \dots, y_N)^\top$, $\varepsilon = (\varepsilon_1, \dots, \varepsilon_N)^\top$, and $\eta_X(\theta) = (\eta(x_1, \theta), \dots, \eta(x_N, \theta))^\top$. The design space $\mathcal{X}$ is supposed here to be finite. Instead of the exact design $X$ we can consider equivalently, for any $x \in \mathcal{X}$, the value $\xi(x)$ of the relative frequency of $x$ within the design $X$. By a standard approximation procedure, we consider the set $\Xi$ of all probability measures defined on $\mathcal{X}$ as the set of all approximate designs allowed in the experiment.

In the main part of the present paper we suppose the linearity of the response function, $\eta(x, \theta) = f^\top(x)\,\theta$, and we suppose $\Theta = \mathbb{R}^p$. In a standard way, to any $\xi \in \Xi$ is associated its information matrix

$$M(\xi) = \sum_{x \in \mathcal{X}} f(x) f^\top(x)\, \xi(x),$$

with $f(x) = (f_1(x), \dots, f_p(x))^\top$. According to the aim of the experiment, we may choose an optimality criterion $\phi(\xi)$, and a design $\mu$ is considered $\phi$-optimal when $\phi(\mu) = \max_{\xi \in \Xi} \phi(\xi)$. Standard criteria $\phi(\cdot)$ are concave functions on $\Xi$ having a statistical interpretation.
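As a small illustration of the information matrix defined above, the following sketch evaluates $M(\xi)$ on a finite design space. The regressors $f(x) = (1, x, x^2)^\top$, the three-point design space and the function name `information_matrix` are assumptions chosen only for the example.

```python
import numpy as np

def information_matrix(F, xi):
    """Return M(xi) = sum_x f(x) f(x)^T xi(x); row j of F holds f(x_j)^T."""
    return F.T @ (F * xi[:, None])

# Hypothetical illustration: f(x) = (1, x, x^2) on the design space {-1, 0, 1}.
grid = np.array([-1.0, 0.0, 1.0])
F = np.column_stack([np.ones_like(grid), grid, grid**2])
xi = np.full(3, 1.0 / 3.0)   # approximate design: nonnegative weights summing to 1
M = information_matrix(F, xi)
print(M)
```

Storing the regressors row-wise in `F` keeps the computation a single matrix product, which is convenient when the design space contains many points.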

In Pronzato and Pazman (2013) the criteria of E-, c-, and G-optimality were considered, and the corresponding criteria functions were rewritten in the form

$$\phi(\xi) = \min_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} T(u, x)\, \xi(x)$$

with given $T(u, x)$. This, together with the standard restrictions on $\xi$, defines an “infinite-dimensional” linear programming (LP) problem: to choose the values of $\xi(x)$; $x \in \mathcal{X}$ and of $t \in \mathbb{R}$ so as to maximize $t$ under infinitely many linear restrictions:

$$\sum_{x \in \mathcal{X}} T(u, x)\, \xi(x) \ge t \quad \text{for any } u \in \mathcal{U},$$

$$\xi(x) \ge 0 \quad \text{for any } x \in \mathcal{X}, \quad \text{and} \quad \sum_{x \in \mathcal{X}} \xi(x) = 1.$$


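In particular, for E-optimality, with $\phi_E(\xi)$ equal to the minimum eigenvalue of $M(\xi)$, we have

$$\phi_E(\xi) = \min_{u \in \mathbb{R}^p,\, u \neq 0} \frac{u^\top M(\xi)\, u}{u^\top u} = \min_{u \in \mathbb{R}^p,\, u \neq 0} \sum_{x \in \mathcal{X}} \frac{[f^\top(x)\, u]^2}{u^\top u}\, \xi(x).$$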
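The representation of the E-criterion as a minimum of functions linear in $\xi$ can be checked numerically. The sketch below is only an illustration under assumed settings (quadratic regressors on a five-point grid, uniform weights): every Rayleigh-type sum is at least the smallest eigenvalue, and the eigenvector of that eigenvalue attains it.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 5)                       # assumed design space
F = np.column_stack([np.ones_like(grid), grid, grid**2])
xi = np.full(grid.size, 1.0 / grid.size)               # assumed design

M = F.T @ (F * xi[:, None])
lam, U = np.linalg.eigh(M)                             # eigenvalues in ascending order
phi_E = lam[0]

def linear_form(u):
    """sum_x [f(x)^T u]^2 / (u^T u) * xi(x), which is linear in xi for fixed u."""
    return float(np.sum(xi * (F @ u) ** 2) / (u @ u))

samples = [linear_form(rng.standard_normal(3)) for _ in range(1000)]
print(phi_E, min(samples), linear_form(U[:, 0]))
```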


The main idea of Pazman and Pronzato (2014) was to substitute the nonlinear response function $\eta(x, \theta)$ for $f^\top(x)\,\theta$ and so to obtain new criteria for nonlinear models, with the aim of detecting the lack of identifiability under the design $\xi$. However, a second aim of Pazman and Pronzato (2014) was to draw attention to the fact that, with those expressions for the criteria, an LP method can be used to obtain nearly optimum designs in linear models.

In the present paper we follow this second aim, but for the D-, A-, and $E_k$-optimality criteria, and also for the computationally demanding task of finding the “criterion robust” optimum design in linear models, or of finding a D-optimum design under the condition that the A-optimality criterion exceeds a given value. The difficulties in achieving also the first aim for the D-, A-, and $E_k$-criteria are discussed in the Appendix.

We note that an LP method has been used to compute c-optimal designs in Harman and Jurik (2008), but under a quite different set-up.


2 Reformulation of the optimality criteria 

The D-optimal design maximizes $\det(M(\xi))$, hence minimizes the generalized variance of $\hat\theta$, the BLUE of $\theta$. The A-optimal design minimizes the sum of the variances of $\hat\theta_1, \dots, \hat\theta_p$. The $E_k$-optimal design maximizes the sum of the smallest $k$ eigenvalues of $M(\xi)$. There are many forms of expressing the corresponding criteria functions $\phi(\xi)$. All forms of $\phi(\xi)$ representing the same criterion maintain the ordering of the designs but differ by the scaling of this ordering, say $\phi(\xi) = \ln\det[M(\xi)]$ and $\phi(\xi) = \det^{1/p}[M(\xi)]$ for D-optimality, and similarly for the other criteria. Here we prefer criteria functions which are not only concave, but also positively homogeneous, $\phi(a\xi) = a\,\phi(\xi)$ for $a > 0$ (see Pukelsheim (1993) for a justification). So for D-optimality $\phi_D(\xi) = \det^{1/p}[M(\xi)]$, for A-optimality $\phi_A(\xi) = 1/\mathrm{tr}[M^{-1}(\xi)]$ when $M(\xi)$ is nonsingular, and for $E_k$-optimality $\phi_{E_k}(\xi) = \lambda_1(\xi) + \dots + \lambda_k(\xi)$, where $\lambda_1(\xi) \le \lambda_2(\xi) \le \dots \le \lambda_p(\xi)$ is the ordering of the eigenvalues of $M(\xi)$ respecting their multiplicity. Denote by $u_1(\xi), \dots, u_p(\xi)$ the corresponding orthonormal eigenvectors of $M(\xi)$. Denote also $\Xi^+ = \{\mu \in \Xi : M(\mu) \text{ is nonsingular}\}$. D- and A-optimal designs are evidently localized on $\Xi^+$, which need not be true for $E_k$-optimality.
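The three positively homogeneous criteria can be sketched directly from their definitions. The function names and the sample matrix (the information matrix of the uniform design on $\{-1, 0, 1\}$ for $f(x) = (1, x, x^2)^\top$) are assumptions made for the illustration; the last lines check the homogeneity $\phi(aM) = a\,\phi(M)$ numerically.

```python
import numpy as np

def phi_D(M):
    """det^{1/p}[M]."""
    return np.linalg.det(M) ** (1.0 / M.shape[0])

def phi_A(M):
    """1 / tr[M^{-1}], for nonsingular M."""
    return 1.0 / np.trace(np.linalg.inv(M))

def phi_Ek(M, k):
    """Sum of the k smallest eigenvalues (eigvalsh returns them in ascending order)."""
    return float(np.sum(np.linalg.eigvalsh(M)[:k]))

# Assumed example matrix: uniform design on {-1, 0, 1}, f(x) = (1, x, x^2).
M = np.array([[1.0, 0.0, 2/3], [0.0, 2/3, 0.0], [2/3, 0.0, 2/3]])
a = 2.5
for name, value, scaled in [("D", phi_D(M), phi_D(a * M)),
                            ("A", phi_A(M), phi_A(a * M)),
                            ("E_2", phi_Ek(M, 2), phi_Ek(a * M, 2))]:
    print(name, value, scaled / value)   # the ratio should equal a
```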

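Theorem 1. We can write

$$\phi_D(\xi) = \min_{\mu \in \Xi^+} \sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x) = \min_{\mu \in \Xi^+} \sum_{x \in \mathcal{X}} \left\{ \frac{\det^{1/p}[M(\mu)]}{p}\, f^\top(x)\, M^{-1}(\mu)\, f(x) \right\} \xi(x), \qquad (1)$$

$$\phi_A(\xi) = \min_{\mu \in \Xi^+} \sum_{x \in \mathcal{X}} H_A(\mu, x)\, \xi(x) = \min_{\mu \in \Xi^+} \sum_{x \in \mathcal{X}} \frac{\| M^{-1}(\mu)\, f(x) \|^2}{\{\mathrm{tr}[M^{-1}(\mu)]\}^2}\, \xi(x) \qquad (2)$$

for any $\xi \in \Xi^+$, and

$$\phi_{E_k}(\xi) = \min_{\mu \in \Xi} \sum_{x \in \mathcal{X}} H_{E_k}(\mu, x)\, \xi(x) = \min_{\mu \in \Xi} \sum_{x \in \mathcal{X}} \| P^{(k)}(\mu)\, f(x) \|^2\, \xi(x) \qquad (3)$$

for any $\xi \in \Xi$. Here $P^{(k)}(\mu)$ is the $k$-dimensional orthogonal projector $P^{(k)}(\mu) = \sum_{i=1}^{k} u_i(\mu)\, u_i^\top(\mu)$; $\|\cdot\|$ denotes the Euclidean norm.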
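The statement of Theorem 1 can be illustrated numerically: for a fixed design $\xi$, the sum $\sum_x H(\mu, x)\xi(x)$ should never fall below the criterion value, and it should equal it at $\mu = \xi$. The sketch below assumes quadratic regressors on a seven-point grid and random designs drawn from a Dirichlet distribution; these choices, and the helper names, are illustrative only.

```python
import numpy as np

def info(F, w):
    return F.T @ (F * w[:, None])

def H_D(F, mu):
    M = info(F, mu)
    p = M.shape[0]
    quad = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)   # f(x)^T M^{-1}(mu) f(x)
    return np.linalg.det(M) ** (1.0 / p) / p * quad

def H_A(F, mu):
    Minv = np.linalg.inv(info(F, mu))
    return np.sum((F @ Minv) ** 2, axis=1) / np.trace(Minv) ** 2

def H_Ek(F, mu, k):
    _, U = np.linalg.eigh(info(F, mu))           # columns ordered by ascending eigenvalue
    return np.sum((F @ U[:, :k]) ** 2, axis=1)   # ||P^(k)(mu) f(x)||^2

rng = np.random.default_rng(1)
grid = np.linspace(-1.0, 1.0, 7)
F = np.column_stack([np.ones_like(grid), grid, grid**2])
xi = rng.dirichlet(np.ones(grid.size))
M = info(F, xi)

phi = {"D": np.linalg.det(M) ** (1 / 3),
       "A": 1.0 / np.trace(np.linalg.inv(M)),
       "E_2": float(np.sum(np.linalg.eigvalsh(M)[:2]))}
H = {"D": H_D, "A": H_A, "E_2": lambda F_, mu: H_Ek(F_, mu, 2)}

check = {}
for name in phi:
    at_xi = float(H[name](F, xi) @ xi)                       # value at mu = xi
    others = min(float(H[name](F, rng.dirichlet(np.ones(grid.size))) @ xi)
                 for _ in range(200))                        # values at random mu
    check[name] = (phi[name], at_xi, others)
    print(name, check[name])
```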

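Proof. In the proof we shall often use that $\mathrm{tr}[AB] = \mathrm{tr}[BA]$ for any matrices $A = A_{r \times s}$, $B = B_{s \times r}$ (Harville, 2000). By the known inequality between the geometric and arithmetic means of positive numbers (cf. (Steele, 2004, Chap. 2)), we obtain

$$\{\det[S^\top M(\xi)\, S]\}^{1/p} = \Bigl(\prod_{i=1}^{p} \varrho_i\Bigr)^{1/p} \le \frac{1}{p} \sum_{i=1}^{p} \varrho_i = \frac{1}{p}\, \mathrm{tr}[S^\top M(\xi)\, S]$$

for any nonsingular $p \times p$ matrix $S$. Here $\varrho_1, \dots, \varrho_p$ are the eigenvalues of $S^\top M(\xi)\, S$. So

$$\det{}^{1/p}[M(\xi)] \le \frac{1}{p}\, \det{}^{-1/p}[S S^\top] \sum_{x \in \mathcal{X}} f^\top(x)\, S S^\top f(x)\, \xi(x),$$

and we have just to put $S = M^{-1/2}(\mu)$ to obtain the expression in (1). If $S = M^{-1/2}(\xi)$ then $\varrho_i = 1$; $i = 1, \dots, p$, and the geometric mean is equal to the arithmetic mean, so the minimum is attained.

For any nonsingular $p \times p$ matrix $S$ we obtain from the Schwarz inequality

$$[\mathrm{tr}(S)]^2 = \{\mathrm{tr}[M^{-1/2}(\xi)\, S\, M^{1/2}(\xi)]\}^2 \le \mathrm{tr}[M^{-1}(\xi)]\; \mathrm{tr}[M^{1/2}(\xi)\, S^\top S\, M^{1/2}(\xi)] = \mathrm{tr}[M^{-1}(\xi)]\; \mathrm{tr}[S M(\xi)\, S^\top],$$

since in general $\mathrm{tr}[A^\top B]$ is a scalar product of matrices $A$, $B$, and since $M^{-1/2}(\xi)$ and $M^{1/2}(\xi)$ are symmetric matrices. So

$$\{\mathrm{tr}[M^{-1}(\xi)]\}^{-1} \le \frac{\mathrm{tr}[S M(\xi)\, S^\top]}{[\mathrm{tr}(S)]^2} = \sum_{x \in \mathcal{X}} \frac{\| S f(x) \|^2}{[\mathrm{tr}(S)]^2}\, \xi(x),$$

and we have just to put $S = M^{-1}(\mu)$ to obtain the expression in (2). When $S = M^{-1}(\xi)$, we evidently obtain an equality in the Schwarz inequality.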

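Denote $P = P^{(k)}(\mu)$. By the definition of $P^{(k)}(\mu)$ we have $PP = P$ and $P = P^\top$. So

$$\sum_{x \in \mathcal{X}} \| P f(x) \|^2\, \xi(x) = \sum_{x \in \mathcal{X}} f^\top(x)\, P f(x)\, \xi(x) = \mathrm{tr}[P M(\xi)\, P].$$

On the other hand, denote $U = (u_1(\xi), \dots, u_p(\xi))$, $\Lambda = \mathrm{diag}\{\lambda_1(\xi), \dots, \lambda_p(\xi)\}$, and use that $M(\xi) = U \Lambda U^\top$ to obtain

$$\mathrm{tr}[P M(\xi)\, P] = \mathrm{tr}[P U \Lambda U^\top P] = \mathrm{tr}[\Lambda\, (PU)^\top (PU)] = \sum_{i=1}^{p} \lambda_i(\xi)\, \| P u_i(\xi) \|^2 = \sum_{i=1}^{p} \lambda_i(\xi)\, w_i,$$

where we denoted $w_i = u_i^\top(\xi)\, P\, u_i(\xi) = \| P u_i(\xi) \|^2$. Since $U U^\top = U^\top U = I$, we have

$$k = \mathrm{tr}[P] = \mathrm{tr}[P^\top P] = \mathrm{tr}[P^\top P\, U U^\top] = \sum_{i=1}^{p} \| P u_i(\xi) \|^2 = \sum_{i=1}^{p} w_i.$$

Further $w_i \in [0, 1]$, since $0 \le \| P u_i(\xi) \|^2 \le \| u_i(\xi) \|^2 = 1$. So, using that $\lambda_1(\xi) \le \dots \le \lambda_p(\xi)$, we obtain that $\sum_{i=1}^{p} \lambda_i(\xi)\, w_i$ is minimized exactly when the weights $w_i$ have maximum value ($= 1$) at the smallest $k$ values of $\lambda_i(\xi)$.

Summarizing we obtain

$$\sum_{x \in \mathcal{X}} \| P f(x) \|^2\, \xi(x) = \mathrm{tr}[P M(\xi)\, P] = \sum_{i=1}^{p} \lambda_i(\xi)\, w_i \ge \sum_{i=1}^{k} \lambda_i(\xi) = \phi_{E_k}(\xi). \qquad (4)$$

In the particular case that $P = P^{(k)}(\xi) = \sum_{j=1}^{k} u_j(\xi)\, u_j^\top(\xi)$ we have $w_i = \| P^{(k)}(\xi)\, u_i(\xi) \|^2 = \| u_i(\xi) \|^2 = 1$ if $i \le k$, and $\| P^{(k)}(\xi)\, u_i(\xi) \|^2 = 0$ if $i > k$, hence $\sum_{x \in \mathcal{X}} \| P^{(k)}(\xi)\, f(x) \|^2\, \xi(x) = \sum_{i=1}^{k} \lambda_i(\xi) = \phi_{E_k}(\xi)$, which together with (4) yields the expression in (3).

□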

Remark 1. We could write in (2) $\phi_A(\xi) = \min_{B \in \mathcal{B}} \sum_{x \in \mathcal{X}} \frac{\| B f(x) \|^2}{[\mathrm{tr}(B)]^2}\, \xi(x)$, where $\mathcal{B}$ is any set of nonsingular matrices containing $M^{-1}(\xi)$. When this formula should hold for all $\xi \in \Xi^+$, then the set $\mathcal{B} = \{M^{-1}(\mu) : \mu \in \Xi^+\}$ is the smallest of such sets. A similar modification could be done for D-optimality in (1). In (3) we could minimize over any set of $k$-dimensional projectors containing $P^{(k)}(\xi)$.

Remark 2. As follows from (Pronzato and Pazman, 2013, Chap. 9.5), we could obtain results similar to those of Theorem 1 by considering gradients or subgradients of $\phi(\xi)$. However, the presented direct proofs, which avoid the not very common notion of subgradients, may be more attractive for practitioners.

3 The iterative computation by LP; the algorithms and examples

3.1 Algorithm for D-, A-, and $E_k$-optimality

Let us write $H(\mu, x)$ instead of $H_D(\mu, x)$, $H_A(\mu, x)$, or $H_{E_k}(\mu, x)$ from Theorem 1. For the maximization of $\phi$ we apply a modification of the cutting-plane method (Kelley, 1960) as presented in Pronzato and Pazman (2013) and Pazman and Pronzato (2014):

0. Take any vector $\xi^{(0)}$ such that $\sum_{x \in \mathcal{X}} \xi^{(0)}(x) = 1$, $\xi^{(0)}(x) \ge 0 \ \forall x \in \mathcal{X}$, choose $\epsilon > 0$, set $\Xi^{(0)} = \emptyset$ and $n = 0$.

1. Set $\Xi^{(n+1)} = \Xi^{(n)} \cup \{\xi^{(n)}\}$.

2. Use the LP solver to find $(\xi^{(n+1)}, t^{(n+1)})$ so as to maximize $t$ satisfying the constraints:

• $t > 0$, $\xi(x) \ge 0 \ \forall x \in \mathcal{X}$, $\sum_{x \in \mathcal{X}} \xi(x) = 1$,

• $\sum_{x \in \mathcal{X}} H(\mu, x)\, \xi(x) \ge t \ \ \forall \mu \in \Xi^{(n+1)}$.

3. Set $\Delta^{(n+1)} = t^{(n+1)} - \phi(\xi^{(n+1)})$; if $\Delta^{(n+1)} < \epsilon$ take $\xi^{(n+1)}$ as an $\epsilon$-optimal design and stop, or else $n \leftarrow n + 1$ and continue by step 1.
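The steps above can be sketched for D-optimality as follows. This is a minimal illustration, not the authors' Matlab implementation: it assumes SciPy's `linprog` with the HiGHS solver, quadratic regressors on a 21-point grid, and a small regularization $M(\mu) + \gamma I$ inside the cut so that singular iterates do not break the matrix inverse (by Remark 1 such a substitute still gives a valid upper bound). The names `cutting_plane_D`, `gamma` and `max_iter` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def info(F, w):
    return F.T @ (F * w[:, None])

def phi_D(F, w):
    M = info(F, w)
    return max(np.linalg.det(M), 0.0) ** (1.0 / M.shape[0])

def H_D(F, mu, gamma=1e-6):
    # Cut coefficients H_D(mu, x); M(mu) is replaced by M(mu) + gamma*I (assumption).
    p = F.shape[1]
    M = info(F, mu) + gamma * np.eye(p)
    quad = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)
    return np.linalg.det(M) ** (1.0 / p) / p * quad

def cutting_plane_D(F, xi0, eps=1e-4, max_iter=200):
    m = F.shape[0]
    xi, cuts = xi0.copy(), []
    c = np.zeros(m + 1)
    c[-1] = -1.0                                   # maximize t  <=>  minimize -t
    A_eq = np.append(np.ones(m), 0.0)[None, :]     # sum_x xi(x) = 1
    t = np.inf
    for _ in range(max_iter):
        cuts.append(H_D(F, xi))                    # step 1: add xi^(n) to the set
        # step 2: constraints t - sum_x H(mu, x) xi(x) <= 0 for every stored mu
        A_ub = np.hstack([-np.array(cuts), np.ones((len(cuts), 1))])
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(cuts)), A_eq=A_eq,
                      b_eq=[1.0], bounds=[(0, None)] * (m + 1), method="highs")
        xi = np.clip(res.x[:m], 0.0, None)
        xi = xi / xi.sum()
        t = res.x[-1]
        if t - phi_D(F, xi) < eps:                 # step 3: stopping rule
            break
    return xi, t

grid = np.linspace(-1.0, 1.0, 21)
F = np.column_stack([np.ones_like(grid), grid, grid**2])
xi_hat, t_hat = cutting_plane_D(F, np.full(grid.size, 1.0 / grid.size))
phi_star = (4.0 / 27.0) ** (1.0 / 3.0)   # known D-optimal value for quadratic regression on [-1, 1]
print(phi_D(F, xi_hat), phi_star, t_hat)
```

Whether or not the loop reaches the tolerance within `max_iter` iterations, the returned pair brackets the optimum: the criterion value of the returned design is a lower bound and `t_hat` is an upper bound.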


Notice that $\min_{\mu \in \Xi^{(n+1)}} \sum_{x \in \mathcal{X}} H(\mu, x)\, \xi(x)$ is an upper piecewise linear approximation of $\phi(\xi)$. Increasing $n$, the set $\Xi^{(n+1)} \subset \Xi$ becomes larger and the approximation is better.

On the other hand, when $n$ is small, the information matrix $M(\xi^{(n)})$ could be ill-conditioned or even singular. In order to avoid the difficulty with inverse matrices in D- and A-optimality, it is possible to use any symmetric positive definite matrix as a substitute for $M(\xi^{(n)})$, as justified in Remark 1. Alternatively, Pronzato and Pazman (2013) recommend the regularization $M(\xi^{(n)}) + \gamma I$, where $\gamma$ is a small positive number and $I$ is the identity matrix. Note that it is also possible to take $\Xi^{(0)}$ as a nonempty set containing $s \ge 1$ initial designs. If $s$ or $n$ is large, an ill-conditioned or singular information matrix $M(\xi^{(n)})$ is less likely.

The problem of a singular information matrix does not appear in the $E_k$-optimality criteria.

The stopping rule used in the above algorithm follows from the upper and lower bounds for $\max_{\xi \in \Xi} \phi(\xi)$:

$$\phi(\xi^{(n+1)}) \le \max_{\xi \in \Xi} \phi(\xi) \le t^{(n+1)}.$$

The first inequality is obvious. Note that $t^{(n+1)} = \max_{\xi \in \Xi} \min_{\mu \in \Xi^{(n+1)}} \sum_{x \in \mathcal{X}} H(\mu, x)\, \xi(x)$, while $\max_{\xi \in \Xi} \phi(\xi) = \max_{\xi \in \Xi} \min_{\mu \in \Xi} \sum_{x \in \mathcal{X}} H(\mu, x)\, \xi(x)$, and $\Xi \supset \Xi^{(n+1)}$. This yields the second inequality.

Stopping rules based on the equivalence theorem (Kiefer, 1974; Kiefer and Wolfowitz, 1959), which are considered standard, are also available. Let $\epsilon_{stop}$ be a chosen small nonnegative number. An iterative algorithm will stop if $d(\xi^{(n)}) < \epsilon_{stop}$, where for D-optimality $d(\xi^{(n)}) = \bigl| \max_{x \in \mathcal{X}} f^\top(x)\, M^{-1}(\xi^{(n)})\, f(x) - p \bigr|$ and for the criterion of A-optimality $d(\xi^{(n)}) = \bigl| \max_{x \in \mathcal{X}} f^\top(x)\, M^{-2}(\xi^{(n)})\, f(x) - \mathrm{tr}[M^{-1}(\xi^{(n)})] \bigr|$, as seen e.g. in Kiefer (1974, 1975). According to Harman (2004) the stopping rule for the $E_k$-optimality criteria is $d(\xi^{(n)}) = \bigl| \phi_{E_k}(\xi^{(n)}) - \max_{x \in \mathcal{X}} \sum_{i=1}^{k} [f^\top(x)\, u_i(\xi^{(n)})]^2 \bigr|$, which can be used only if $\lambda_k(\xi^{(n)}) < \lambda_{k+1}(\xi^{(n)})$.
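The equivalence-theorem quantities for D- and A-optimality are cheap to evaluate. The sketch below assumes quadratic regression on a 21-point grid of $[-1, 1]$ and uses the classical optimal designs for that model (equal weights on $\{-1, 0, 1\}$ for D; weights $1/4, 1/2, 1/4$ for A) as test cases; the function names are illustrative.

```python
import numpy as np

def info(F, w):
    return F.T @ (F * w[:, None])

def d_D(F, xi):
    """| max_x f(x)^T M^{-1} f(x) - p |."""
    Minv = np.linalg.inv(info(F, xi))
    variance = np.einsum("ij,jk,ik->i", F, Minv, F)
    return abs(variance.max() - F.shape[1])

def d_A(F, xi):
    """| max_x f(x)^T M^{-2} f(x) - tr[M^{-1}] |, using f^T M^{-2} f = ||M^{-1} f||^2."""
    Minv = np.linalg.inv(info(F, xi))
    return abs(np.sum((F @ Minv) ** 2, axis=1).max() - np.trace(Minv))

grid = np.linspace(-1.0, 1.0, 21)
F = np.column_stack([np.ones_like(grid), grid, grid**2])

xi_D = np.zeros(grid.size)
xi_D[[0, 10, 20]] = 1.0 / 3.0                 # mass 1/3 at x = -1, 0, 1
xi_A = np.zeros(grid.size)
xi_A[[0, 10, 20]] = [0.25, 0.5, 0.25]         # mass 1/4, 1/2, 1/4 at x = -1, 0, 1
xi_uniform = np.full(grid.size, 1.0 / grid.size)

print(d_D(F, xi_D), d_A(F, xi_A), d_D(F, xi_uniform))
```

The first two values vanish up to rounding, while the uniform design gives a clearly positive value and would not pass the stopping rule.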
As mentioned in (Pronzato and Pazman, 2013, Chap. 9.5), the cutting-plane method can have bad convergence properties (see Bonnans et al. (2006); Nesterov (2004)). One can then use the level method (see Nesterov (2004) or Pronzato and Pazman (2013)), which adds a quadratic programming step to the method of cutting planes.

In the examples below we compare the known optimal designs with the results of our algorithm. The computations were performed in Matlab on a bi-processor PC (3.10 GHz) equipped with 6 GB of RAM and 64-bit Windows 8.1. The LP problems were solved with an interior point method.

Example 1. Consider the nonlinear regression model of Atkinson et al. (1993),

$$\eta(x, \theta) = \theta_1 [\exp(-\theta_2 x) - \exp(-\theta_3 x)], \qquad x \in \mathbb{R}^+, \quad \theta = (\theta_1, \theta_2, \theta_3)^\top.$$

We use the algorithm of Sec. 3.1 to compute locally D- and $E_1$-optimal designs for the nominal value of the parameter $\theta^0 = (21.8, 0.05884, 4.298)^\top$, so we shall write $\partial \eta(x, \theta)/\partial \theta \,|_{\theta^0}$ instead of $f(x)$ everywhere. We take a finite design space containing 24,000 points, $\mathcal{X} = \{0.001, 0.002, \dots, 23.999, 24.000\}$, and $\epsilon = 10^{-10}$, with $\xi^{(0)}(x) = 1/3$ if $x \in \{0.2, 1, 23\}$ and $\xi^{(0)}(x) = 0$ otherwise. The computed designs are given in Table 1. Notice that the computed results correspond to those in Atkinson et al. (1993).
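For this model the local regressors are the partial derivatives of $\eta$ at the nominal value. The sketch below writes them out and evaluates $\phi_D$ at the three-point design reported for D-optimality in Table 1 (equal weights on $0.229$, $1.389$, $18.417$); the function names are illustrative assumptions.

```python
import numpy as np

THETA0 = np.array([21.8, 0.05884, 4.298])     # nominal value quoted in Example 1

def eta(x, th):
    return th[0] * (np.exp(-th[1] * x) - np.exp(-th[2] * x))

def grad_eta(x, th):
    """Rows hold d eta(x, theta) / d theta at th, one row per design point."""
    e2, e3 = np.exp(-th[1] * x), np.exp(-th[2] * x)
    return np.column_stack([e2 - e3, -th[0] * x * e2, th[0] * x * e3])

x = np.array([0.229, 1.389, 18.417])          # support reported for the D-optimal design
xi = np.full(3, 1.0 / 3.0)
F = grad_eta(x, THETA0)
M = F.T @ (F * xi[:, None])
phi_D = np.linalg.det(M) ** (1.0 / 3.0)
print(phi_D)                                  # should be close to the value 11.739 of Table 1
```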


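| $\phi$ | $\xi^*$ (support points; weights) | $\phi^*$ | iter. | time | $d(\xi^*)$ |
|---|---|---|---|---|---|
| D | 0.229, 1.389, 18.417; 0.3333, 0.3333, 0.3333 | $\phi_D^* = 11.739$ | 64 | 16m 9s | $1.5 \cdot 10^{-?}$ |
| $E_1$ | 0.169, 1.394, 23.402, 23.403; 0.1993, 0.6623, 0.0415, 0.0969 | $\phi_{E_1}^* = 0.3163$ | 49 | 5m 53s | $3.89 \cdot 10^{-6}$ |

Table 1: Example 1: the locally optimal designs $\xi_D^*$ and $\xi_{E_1}^*$ (column 2); $\phi_D^* = \phi_D(\xi_D^*)$, $\phi_{E_1}^* = \phi_{E_1}(\xi_{E_1}^*)$ (column 3); the number of iterations (column 4) and the computational time (column 5) required until the algorithm stopped; the corresponding value of $d(\xi^*)$ based on the equivalence theorem (column 6).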


3.2 Algorithm for computing criterion robust designs 


The criteria of $E_k$-optimality play a special role in experimental design. We say that the design $\xi$ is not worse than the design $\mu$ with respect to the Schur ordering of designs if $\phi_{E_k}(\xi) \ge \phi_{E_k}(\mu)$ for all $k = 1, \dots, p$. Then also $\phi(\xi) \ge \phi(\mu)$ for many other optimality criteria. However, the Schur ordering is a partial ordering of designs, and a Schur-optimal design exists only in some very particular cases. On the other hand, if we denote by $\mathcal{O}$ the set of all criteria functions $\phi(\xi)$ which are concave and positively homogeneous, and moreover are orthogonally invariant in the sense that $\phi(\xi) = \Phi[M(\xi)]$ with $\Phi[M(\xi)] = \Phi[U^\top M(\xi)\, U]$ for any orthogonal matrix $U$, it makes sense to look for a design $\xi^*_{ef}$ which is maximin efficient with respect to such criteria, i.e.

$$\xi^*_{ef} = \arg\max_{\xi \in \Xi} \min_{\phi \in \mathcal{O}} \left[ \frac{\phi(\xi)}{\max_{\zeta \in \Xi} \phi(\zeta)} \right].$$

Here the ratio $\phi(\xi) / \max_{\zeta \in \Xi} \phi(\zeta)$ is called the $\phi$-efficiency of the design $\xi$. This maximin efficiency problem can be simplified (cf. Harman (2004)): the solution $\xi^*_{ef}$ coincides with the solution of

$$\xi^*_{ef} = \arg\max_{\xi \in \Xi} \min_{1 \le k \le p} \frac{\phi_{E_k}(\xi)}{\max_{\zeta \in \Xi} \phi_{E_k}(\zeta)},$$

i.e. with the design which is maximin efficient in the (finite) class of all $E_k$-optimality criteria. Such a design is also called “criterion robust” in Harman (2004). But even this problem is computationally difficult, mainly because the $E_k$-optimality criteria are not differentiable. For us it is important that we can approach the solution of this problem by the LP technique. First, using Theorem 1 we compute $E_k(opt) = \max_{\zeta \in \Xi} \phi_{E_k}(\zeta)$ for all $k$ (see Sec. 3.1), and then we can formulate another “infinite-dimensional” LP problem: to choose the values of $\xi(x)$; $x \in \mathcal{X}$ and of $t \in \mathbb{R}$ so as to maximize $t$ under the linear constraints:

$$\sum_{x \in \mathcal{X}} \frac{\| P^{(k)}(\mu)\, f(x) \|^2}{E_k(opt)}\, \xi(x) \ge t \quad \text{for any } \mu \in \Xi \text{ and for every } k \in \{1, \dots, p\},$$

$$\xi(x) \ge 0 \quad \text{for any } x \in \mathcal{X}, \quad \text{and} \quad \sum_{x \in \mathcal{X}} \xi(x) = 1.$$

In order to compute the maximin efficient design, the algorithm of Sec. 3.1 needs to be modified in step 2. Specifically, the constraints in the LP problem will be:

• $t > 0$, $\xi(x) \ge 0 \ \forall x \in \mathcal{X}$, $\sum_{x \in \mathcal{X}} \xi(x) = 1$,

• $\sum_{x \in \mathcal{X}} \frac{H_{E_k}(\mu, x)}{E_k(opt)}\, \xi(x) \ge t \ \ \forall \mu \in \Xi^{(n+1)}$ and $\forall k = 1, \dots, p$,

where $E_k(opt) = \max_{\zeta \in \Xi} \phi_{E_k}(\zeta)$ is computed using the unmodified algorithm of Sec. 3.1 for all $k = 1, \dots, p$.
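The quantity maximized in this LP, the minimal $E_k$-efficiency, is easy to evaluate for a given design. The sketch below uses the $q = 1$ case of Example 2 (quadratic regression on $\{-1, 0, 1\}$) with the optimal values $E_k(opt)$ and the masses reported in Tables 2 and 3; splitting the mass of $C_1$ evenly between $x = -1$ and $x = 1$ is an assumption based on symmetry, and the function name is illustrative.

```python
import numpy as np

def min_Ek_efficiency(F, xi, Ek_opt):
    """min over k of phi_{E_k}(xi) / E_k(opt)."""
    M = F.T @ (F * xi[:, None])
    lam = np.linalg.eigvalsh(M)        # ascending eigenvalues
    phi_Ek = np.cumsum(lam)            # phi_{E_k}(xi) for k = 1, ..., p
    return float(np.min(phi_Ek / Ek_opt))

grid = np.array([-1.0, 0.0, 1.0])
F = np.column_stack([np.ones(3), grid, grid**2])
Ek_opt = np.array([0.2, 1.0, 3.0])                   # q = 1 row of Table 2
xi = np.array([0.6468 / 2, 0.3532, 0.6468 / 2])      # masses of Table 3, C_1 split evenly
eff = min_Ek_efficiency(F, xi, Ek_opt)
print(eff)                                           # compare with 0.7646 in Table 3
```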

Example 2. Consider the quadratic regression model on a $q$-dimensional cube:

$$y = \beta_0 + \sum_{i=1}^{q} \beta_i x_i + \sum_{i=1}^{q} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon, \qquad x = (x_1, \dots, x_q)^\top \in [-1, 1]^q, \qquad (5)$$

with a parameter $\beta = (\beta_0, \beta_1, \dots, \beta_q, \beta_{11}, \dots, \beta_{qq}, \beta_{12}, \dots, \beta_{q-1,q})^\top$ of dimension $p = 1 + 3q/2 + q^2/2$. The criterion robust design in the model (5) was analytically studied for $q = 1$ in Harman (2004) and for $q = 2$ in Filova and Harman (2013). The case of $q = 3$ was numerically solved in Filova and Harman (2013).

Consider the sets $C_i = \{x \in \{-1, 0, 1\}^q : \sum_{j=1}^{q} |x_j| = i\}$ for $i = 0, 1, \dots, q$. Thus, $C_0 = \{(0, \dots, 0)^\top\}$ and $C_q$ is the set of all vertices of the $q$-dimensional cube. We shall denote $C = \bigcup_{i=0}^{q} C_i$ and $\xi(C_i) = \sum_{x \in C_i} \xi(x)$. As mentioned in Filova and Harman (2013), for every $\phi \in \mathcal{O}$ there exists a $\phi$-optimal design $\xi^*$ with support on $C$, such that for all $i = 0, 1, \dots, q$ the measure $\xi^*(C_i)$ is uniformly distributed over the points $x \in C_i$ (see also Gaffke (1987); Heiligers (1992)).
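The sets $C_i$ and the regressors of model (5) can be enumerated directly. The following sketch is only an illustration; the function names `cube_layers` and `regressors` are assumptions, and the ordering of the regressors follows the ordering of the parameter $\beta$ given above.

```python
from itertools import product

def cube_layers(q):
    """Return {i: C_i}, where C_i collects the points of {-1, 0, 1}^q with sum |x_j| = i."""
    layers = {i: [] for i in range(q + 1)}
    for x in product((-1, 0, 1), repeat=q):
        layers[sum(abs(v) for v in x)].append(x)
    return layers

def regressors(x):
    """f(x) for model (5): intercept, linear, pure quadratic and interaction terms."""
    q = len(x)
    f = [1.0] + [float(v) for v in x] + [float(v * v) for v in x]
    f += [float(x[i] * x[j]) for i in range(q) for j in range(i + 1, q)]
    return f

for q in (1, 2, 3, 4):
    sizes = [len(points) for points in cube_layers(q).values()]
    p = len(regressors((0,) * q))
    print(q, sizes, p)      # p should equal 1 + 3q/2 + q^2/2
```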

Before computing the criterion robust designs, we needed to evaluate $E_k(opt)$ for $k = 1, \dots, p$. The algorithm of Sec. 3.1, initialized with the uniform measure on $\mathcal{X} = C$ and with $\epsilon = 10^{-?}$, gave the optimal values $E_k(opt)$ summarized in Table 2 for $q = 1, 2, 3, 4$. We observed the same optimal designs as calculated in Harman (2004) for $q = 1$ and in Filova and Harman (2013) for $q = 2, 3$.

Then, using the algorithm of Sec. 3.2, we computed criterion robust designs on $\mathcal{X} = C$ for $q = 1, 2, 3, 4$, obtaining the same results (except $q = 4$) as in Harman (2004); Filova and Harman (2013), and the optimal mass concentrated on $C_i$ is listed in Table 3. Note that for $q = 3$ and $q = 4$ the optimal design $\xi^*$ computed by the algorithm of Sec. 3.2 does not put mass uniformly among $x \in C_i$ with $i = 0, \dots, q$. By redistributing the mass $\xi^*(C_i)$ uniformly over $x \in C_i$ for $i = 0, \dots, q$, we obtained a new design $\xi^{**}$ with the same $\mathcal{O}$-minimal efficiency as achieved by $\xi^*$. Thus, $\xi^{**}$ is another criterion robust design, with the required uniform measure on $C_i$ for any $i = 0, \dots, q$.

| $q$ | $k$ | $E_k(opt)$ | time |
|---|---|---|---|
| 1 | 1; 2; 3 | 0.2; 1; 3 | 2s |
| 2 | 1; 2; 3 to 5; 6 | 0.2; 0.407; $k - 2$; 6 | 17s |
| 3 | 1; 2; 3; 4; 5 to 9; 10 | 0.2; 0.4; 0.667; 1.027; $k - 3$; 10 | 28m 17s |
| 4 | 1; 2; 3; 4; 5; 6 to 14; 15 | 0.2; 0.0433; 0.6242; 0.4834; 1.250; $k - 4$; 15 | 23h 32m 47s |

Table 2: Example 2: the optimal values of the $E_k$-optimality criteria on a $q$-dimensional cube for the model (5) and the total computational time required until the optimal values for all $k = 1, \dots, p$ together were evaluated.

| $q$ | $\xi^*$ | $\Psi^*$ | iter. | time |
|---|---|---|---|---|
| 1 | $\xi^*(C_0) = 0.3532$, $\xi^*(C_1) = 0.6468$ | 0.7646 | 16 | 1s |
| 2 | $\xi^*(C_0) = 0.1775$, $\xi^*(C_1) = 0.2924$, $\xi^*(C_2) = 0.5304$ | 0.7060 | 108 | 1m 20s |
| 3 | $\xi^*(C_0) = 0.0884$, $\xi^*(C_1) = 0.2343$, $\xi^*(C_2) = 0.2306$, $\xi^*(C_3) = 0.4467$ | 0.6642 | 464 | 17m 9s |
| 4 | $\xi^*(C_0) = 0.1097$, $\xi^*(C_1) = 0.0559$, $\xi^*(C_2) = 0.1437$, $\xi^*(C_3) = 0.3062$, $\xi^*(C_4) = 0.3845$ | 0.6526 | 1453 | 10h 20m 7s |

Table 3: Example 2: criterion robust designs (column 2) and the $\mathcal{O}$-minimal efficiency of $\xi^*$, i.e. $\Psi^* = \max_{\xi \in \Xi} \min_{\phi \in \mathcal{O}} \phi(\xi) / \max_{\zeta \in \Xi} \phi(\zeta)$ (column 3), on a $q$-dimensional cube for the model (5); the number of iterations (column 4) and the computational time (column 5) required until the algorithm stopped.

Alternatively, we computed the criterion robust design for $q = 2$ (thus $p = 6$) on a modified design space $\mathcal{X} = \{-1, -0.95, \dots, 0.9, 0.95, 1\}^2$ (i.e. $\mathcal{X}$ is a grid consisting of 1,681 two-dimensional points including the set $C$). Assuming that the values $E_k(opt)$ are known or previously computed for all $k \in \{1, \dots, p\}$, the algorithm of Sec. 3.2, initialized with the uniform measure on $\mathcal{X}$ and $\epsilon = 10^{-10}$, converged after 102 iterations in 36m 5s with the same results as given in Table 3.

3.3 Algorithm for D-optimality conditioned by prescribed A- 
optimality 

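It is not difficult to see that in the considered LP problems we can easily add some supplementary constraints linear in $\xi$, say a cost constraint $\sum_{x \in \mathcal{X}} c(x)\, \xi(x) = c$, where $c(x)$ is the cost of an observation at $x$ and $c$ is proportional to the total cost allowed for the whole experiment. What is less evident is that we can combine optimality criteria. Say, when we want to obtain a D-optimal design under the condition that the A-optimality criterion attains a prescribed value $a$, we have to solve the “infinite-dimensional” LP problem: to choose the values of $\xi(x)$; $x \in \mathcal{X}$ and of $t \in \mathbb{R}$ so as to maximize $t$ under the linear constraints:

$$\sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x) \ge t \quad \text{for any } \mu \in \Xi^+,$$

$$\sum_{x \in \mathcal{X}} H_A(\mu, x)\, \xi(x) \ge a \quad \text{for any } \mu \in \Xi^+,$$

$$\xi(x) \ge 0 \quad \text{for any } x \in \mathcal{X}, \quad \text{and} \quad \sum_{x \in \mathcal{X}} \xi(x) = 1.$$

This problem can be solved by the algorithm of Sec. 3.1 with a modification in the constraints of the LP problem and in the stopping rule.

0. Take any vector $\xi^{(0)}$ such that $\sum_{x \in \mathcal{X}} \xi^{(0)}(x) = 1$, $\xi^{(0)}(x) \ge 0 \ \forall x \in \mathcal{X}$, choose $\epsilon_D > 0$, $\delta_A \le 0$, set $\Xi^{(0)} = \emptyset$ and $n = 0$.

1. Set $\Xi^{(n+1)} = \Xi^{(n)} \cup \{\xi^{(n)}\}$.

2. Use the LP solver to find $(\xi^{(n+1)}, t^{(n+1)})$ so as to maximize $t$ satisfying the constraints:

• $t > 0$, $\xi(x) \ge 0 \ \forall x \in \mathcal{X}$, $\sum_{x \in \mathcal{X}} \xi(x) = 1$,

• $\sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x) \ge t \ \ \forall \mu \in \Xi^{(n+1)}$,

• $\sum_{x \in \mathcal{X}} H_A(\mu, x)\, \xi(x) \ge a \ \ \forall \mu \in \Xi^{(n+1)}$.

3. Set $\Delta_D^{(n+1)} = t^{(n+1)} - \phi_D(\xi^{(n+1)})$ and $\Delta_A^{(n+1)} = \phi_A(\xi^{(n+1)}) - a$. If $\Delta_D^{(n+1)} < \epsilon_D$ and $\Delta_A^{(n+1)} \ge \delta_A$ take $\xi^{(n+1)}$ as an $(\epsilon_D, \delta_A)$-optimal design and stop, or else $n \leftarrow n + 1$ and continue by step 1.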


The constant $\delta_A$ is chosen at the beginning of the algorithm. The preferred value is $\delta_A = 0$; however, by choosing $\delta_A < 0$ but small in absolute value, we can relax the condition on A-optimality.
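The combined algorithm can be sketched by adding the A-optimality cuts to the LP of the cutting-plane loop. As before this is only an illustration under assumptions: SciPy's `linprog` (HiGHS), quadratic regressors on a 21-point grid, a regularization $M(\mu) + \gamma I$ inside both cuts, a slightly negative tolerance on the A-criterion, and a prescribed value $a = 0.12$ lying between the A-criterion of the D-optimal design ($1/9$) and its maximum ($1/8$) for this model. All function and parameter names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def info(F, w):
    return F.T @ (F * w[:, None])

def phi_D(F, w):
    M = info(F, w)
    return max(np.linalg.det(M), 0.0) ** (1.0 / M.shape[0])

def phi_A(F, w):
    lam = np.linalg.eigvalsh(info(F, w))
    return 0.0 if lam[0] <= 1e-12 else 1.0 / np.sum(1.0 / lam)

def cuts(F, mu, gamma=1e-6):
    # Coefficients H_D(mu, x) and H_A(mu, x) with M(mu) + gamma*I as a substitute.
    p = F.shape[1]
    M = info(F, mu) + gamma * np.eye(p)
    Minv = np.linalg.inv(M)
    G = F @ Minv
    h_D = np.linalg.det(M) ** (1.0 / p) / p * np.sum(G * F, axis=1)
    h_A = np.sum(G ** 2, axis=1) / np.trace(Minv) ** 2
    return h_D, h_A

def d_opt_given_a(F, xi0, a, eps_D=1e-4, delta_A=-1e-6, max_iter=300):
    m = F.shape[0]
    xi, rows_D, rows_A = xi0.copy(), [], []
    c = np.zeros(m + 1)
    c[-1] = -1.0                                   # maximize t
    A_eq = np.append(np.ones(m), 0.0)[None, :]
    t = np.inf
    for _ in range(max_iter):
        h_D, h_A = cuts(F, xi)
        rows_D.append(np.append(-h_D, 1.0))        # t - sum_x H_D xi(x) <= 0
        rows_A.append(np.append(-h_A, 0.0))        # -sum_x H_A xi(x) <= -a
        A_ub = np.vstack(rows_D + rows_A)
        b_ub = np.concatenate([np.zeros(len(rows_D)), -a * np.ones(len(rows_A))])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (m + 1), method="highs")
        xi = np.clip(res.x[:m], 0.0, None)
        xi = xi / xi.sum()
        t = res.x[-1]
        if t - phi_D(F, xi) < eps_D and phi_A(F, xi) - a >= delta_A:
            break
    return xi, t

grid = np.linspace(-1.0, 1.0, 21)
F = np.column_stack([np.ones_like(grid), grid, grid**2])
xi_hat, t_hat = d_opt_given_a(F, np.full(grid.size, 1.0 / grid.size), a=0.12)
print(phi_D(F, xi_hat), phi_A(F, xi_hat), t_hat)
```

Since the A-optimal design of this model has $\phi_D = 0.5$ and satisfies every A-cut, the LP value `t_hat` cannot drop below $0.5$, which gives a simple sanity check on the output.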

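Now consider the set $\mathcal{A}^{(n)} = \{\xi \in \Xi : \sum_{x \in \mathcal{X}} H_A(\mu, x)\, \xi(x) \ge a \ \ \forall \mu \in \Xi^{(n)}\}$; then $\mathcal{A}^{(n)} \supset \mathcal{A}^{(n+1)} \supset \mathcal{A} = \{\xi \in \Xi : \phi_A(\xi) \ge a\}$. So the exact solution of our problem would be $\xi^*_{D|a} = \arg\max_{\xi \in \mathcal{A}} \phi_D(\xi)$. We can write:

$$t^{(n+1)} = \max_{\xi \in \mathcal{A}^{(n+1)}} \min_{\mu \in \Xi^{(n+1)}} \sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x) \ge \max_{\xi \in \mathcal{A}^{(n+1)}} \min_{\mu \in \Xi} \sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x) = \max_{\xi \in \mathcal{A}^{(n+1)}} \phi_D(\xi),$$

and then

$$\max_{\xi \in \mathcal{A}} \phi_D(\xi) \le \max_{\xi \in \mathcal{A}^{(n+1)}} \phi_D(\xi) \le t^{(n+1)}, \qquad (6)$$

$$\phi_D(\xi^{(n+1)}) \le \max_{\xi \in \mathcal{A}^{(n+1)}} \phi_D(\xi) \le t^{(n+1)}, \qquad (7)$$

where

$$t^{(n+1)} = \max_{\xi \in \mathcal{A}^{(n+1)}} \min_{\mu \in \Xi^{(n+1)}} \sum_{x \in \mathcal{X}} H_D(\mu, x)\, \xi(x).$$

Assume that $\delta_A = 0$ and the algorithm stopped, i.e. $\Delta_D^{(n+1)} < \epsilon_D$ and $\phi_A(\xi^{(n+1)}) \ge a$. According to (6) and (7) there are only two possibilities: first, if $\max_{\xi \in \mathcal{A}} \phi_D(\xi) \le \phi_D(\xi^{(n+1)})$, then $\xi^{(n+1)}$ is an even “better” design than we expected; second, $\phi_D(\xi^{(n+1)}) \le \max_{\xi \in \mathcal{A}} \phi_D(\xi) \le t^{(n+1)}$, and the stopping rule implies that $\max_{\xi \in \mathcal{A}} \phi_D(\xi) - \phi_D(\xi^{(n+1)}) < \epsilon_D$. Thus $\xi^{(n+1)}$ is an $\epsilon_D$-optimal design in both cases.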

Example 3. Consider the polynomial regression model of degree $d$:

$$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_d x^d + \varepsilon, \qquad x \in [-1, 1], \quad \theta = (\theta_0, \theta_1, \dots, \theta_d)^\top.$$

Denote by $\xi^*_{D|a}$ the D-optimal design under the condition that the A-criterion exceeds a value $a$. Set $\mathcal{X} = \{-1.00, -0.99, -0.98, \dots, 0.99, 1.00\}$ as the design space, suppose that the initial design $\xi^{(0)}$ allocates the unit mass uniformly to each $x \in \mathcal{X}$, $\epsilon_D = 10^{-10}$, and $\delta_A = 0$. Table 4 gives the optimal designs $\xi^*_{D|a}$ for some particular values of $a$ and for $d = 4$, computed by the algorithm of Sec. 3.3 with the above-mentioned setting. Notice that the D- and A-optimal (maximum) values are $\phi_D^* = 0.1339$ and $\phi_A^* = 0.0053$ respectively (see the optimal designs in polynomial regression in (Atkinson and Donev, 1992, Chap. 11) and Pukelsheim and Torsney (1991)). When $a$ is small, the algorithm of Sec. 3.3 will compute the D-optimal design. The initial knowledge of $\phi_A^*$ is necessary because if $a$ exceeds $\phi_A^*$, the algorithm does not work. Figure 1 displays the $\phi_D$- and $\phi_A$-efficiencies of $\xi^*_{D|a}$ as a function of $a$, i.e. $\mathrm{eff}_D(a) = \phi_D(\xi^*_{D|a}) / \phi_D^*$ and $\mathrm{eff}_A(a) = \phi_A(\xi^*_{D|a}) / \phi_A^*$.


| $a$ | $\xi^*_{D|a}$ (support points; weights) | $\phi_D(\xi^*_{D|a})$ | $\phi_A(\xi^*_{D|a})$ | iter. | time |
|---|---|---|---|---|---|
| 0.0052 | −1, −0.68, 0, 0.68, 1; 0.136, 0.2338, 0.2604, 0.2338, 0.136 | 0.1283 | 0.0052 | 97 | 67s |
| 0.005 | −1, −0.68, 0, 0.68, 1; 0.1623, 0.2194, 0.2366, 0.2194, 0.1623 | 0.1317 | 0.005 | 97 | 53s |
| 0.002 | −1, −0.66, −0.65, 0, 0.65, 0.66, 1; 0.2, 0.0847, 0.1152, 0.2, 0.1151, 0.0850, 0.2 | 0.1338 | 0.0044 | 163 | 158s |

Table 4: Example 3: the optimal designs $\xi^*_{D|a}$ (column 2) with different choices of $a$ (column 1); $\phi_D(\xi^*_{D|a})$, the value of the D-optimality criterion (column 3); $\phi_A(\xi^*_{D|a})$, the value of the A-optimality criterion (column 4); the number of iterations (column 5) and the computational time (column 6) required until the algorithm stopped.
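The efficiencies $\mathrm{eff}_D(a)$ and $\mathrm{eff}_A(a)$ for the three rows of Table 4 follow by simple division by the optimal values $\phi_D^* = 0.1339$ and $\phi_A^* = 0.0053$ quoted in Example 3. The short sketch below only redoes this arithmetic on the rounded published numbers, so the results inherit their rounding.

```python
phi_D_star, phi_A_star = 0.1339, 0.0053          # optimal values quoted in Example 3

# a -> (phi_D, phi_A) of the constrained design, copied from Table 4
table4 = {0.0052: (0.1283, 0.0052),
          0.005: (0.1317, 0.005),
          0.002: (0.1338, 0.0044)}

eff = {a: (phi_D / phi_D_star, phi_A / phi_A_star)
       for a, (phi_D, phi_A) in table4.items()}

for a, (eff_D, eff_A) in eff.items():
    print(f"a = {a}: eff_D = {eff_D:.3f}, eff_A = {eff_A:.3f}")
```

Tightening the A-constraint from $a = 0.002$ to $a = 0.0052$ costs roughly four percentage points of D-efficiency while raising the A-efficiency from about 0.83 to about 0.98.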


4 Reformulation of AVE criteria in nonlinear experiments

In general, the information matrix in a nonlinear regression model $y = \eta_X(\theta) + \varepsilon$ is a function of the parameter $\theta$. Similarly as in Theorem 1, we rewrite the (local) D-, A-, and $E_k$-optimality criteria in a nonlinear regression model in the form

$$\phi(\xi, \theta) = \min_{\mu \in \Xi^*} \sum_{x \in \mathcal{X}} H(\mu, x, \theta)\, \xi(x). \qquad (8)$$

Here $\Xi^*$ can be replaced by $\Xi$ or $\Xi^+$ depending on the considered criterion, as in Theorem 1. The reformulation of the expressions in Theorem 1 in terms of average (AVE) optimality criteria $\int_\Theta \phi(\xi, \theta)\, d\pi(\theta)$, where $\pi(\theta)$ is supposed to be a known prior distribution, is also possible and is given in Theorem 2.

Figure 1: The graph of the $\phi_A$-efficiency (dashed line) and of the $\phi_D$-efficiency (solid line) of $\xi^*_{D|a}$ as a function of the prescribed value $a$ in Example 3.


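Theorem 2. We can write

$$\int_\Theta \phi(\xi, \theta)\, d\pi(\theta) = \min_{\mu \in \Xi^*} \sum_{x \in \mathcal{X}} \bar{H}(\mu, x, \pi)\, \xi(x),$$

where $\bar{H}(\mu, x, \pi) = \int_\Theta H(\mu, x, \theta)\, d\pi(\theta)$.

Proof. The design space $\mathcal{X}$ is assumed to be finite, hence the summation and the integration are interchangeable. From (8) we have $\phi(\xi, \theta) \le \sum_{x \in \mathcal{X}} H(\mu, x, \theta)\, \xi(x)$ for any $\mu \in \Xi^*$ and for all $\theta \in \Theta$. We can write

$$\int_\Theta \phi(\xi, \theta)\, d\pi(\theta) \le \sum_{x \in \mathcal{X}} \left[ \int_\Theta H(\mu, x, \theta)\, d\pi(\theta) \right] \xi(x). \qquad (9)$$

Since the inequality (9) holds for every $\mu \in \Xi^*$, evidently:

$$\int_\Theta \phi(\xi, \theta)\, d\pi(\theta) \le \min_{\mu \in \Xi^*} \sum_{x \in \mathcal{X}} \left[ \int_\Theta H(\mu, x, \theta)\, d\pi(\theta) \right] \xi(x). \qquad (10)$$

Theorem 1 implies that the minimum in (8) is attained at $\mu = \xi$ for any $\theta \in \Theta$, so we obtain an equality in (9) for $\mu = \xi$, which together with (10) proves the theorem. □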
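The averaging argument can be illustrated for the D-criterion with a discrete prior. Everything specific in the sketch below is an assumption made for the example: the two-parameter model $\eta(x, \theta) = \theta_1 \exp(-\theta_2 x)$, the three-point prior on $\theta_2$, the design grid, and the helper names. The check is that the averaged criterion equals the averaged linear form at $\mu = \xi$ and is not exceeded from below at other $\mu$.

```python
import numpy as np

def grad(x, th):
    """Local regressors of the hypothetical model eta(x, theta) = th1 * exp(-th2 * x)."""
    e = np.exp(-th[1] * x)
    return np.column_stack([e, -th[0] * x * e])

def phi_D(F, w):
    return np.linalg.det(F.T @ (F * w[:, None])) ** (1.0 / F.shape[1])

def H_D(F, mu):
    p = F.shape[1]
    M = F.T @ (F * mu[:, None])
    return np.linalg.det(M) ** (1.0 / p) / p * np.sum((F @ np.linalg.inv(M)) * F, axis=1)

x = np.linspace(0.1, 3.0, 12)
# Assumed discrete prior: (probability, theta)
prior = [(0.25, np.array([1.0, 0.5])),
         (0.50, np.array([1.0, 1.0])),
         (0.25, np.array([1.0, 2.0]))]

def ave_phi(w):
    return sum(pr * phi_D(grad(x, th), w) for pr, th in prior)

def H_bar(mu):
    return sum(pr * H_D(grad(x, th), mu) for pr, th in prior)

rng = np.random.default_rng(0)
xi = rng.dirichlet(np.ones(x.size))
lhs = ave_phi(xi)                                   # averaged criterion
rhs_at_xi = float(H_bar(xi) @ xi)                   # averaged linear form at mu = xi
rhs_other = min(float(H_bar(rng.dirichlet(np.ones(x.size))) @ xi) for _ in range(200))
print(lhs, rhs_at_xi, rhs_other)
```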


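Appendix: Reformulation of criteria in terms of nonlinear models

Using the notation $\eta(x, \theta) = f^\top(x)\, \theta$ we can rewrite the expressions from Theorem 1 in a form which formally allows an extension of the criteria to a nonlinear model

$$y_x = \eta(x, \theta) + \varepsilon_x, \qquad \theta \in \Theta \subset \mathbb{R}^p.$$

However, for the D-, A-, and $E_k$-optimality criteria we are not so successful as for the E-, c-, and G-optimality criteria in Pazman and Pronzato (2014). Therefore we put the corresponding constructions only in the Appendix.

Theorem 3. Let $\theta^{(0)} \in \Theta$ be a given vector. Denote

$$\mathcal{V}_{\theta^{(0)}} = \left\{ (\theta^{(1)}, \dots, \theta^{(p)}) : \forall i \ \ \theta^{(i)} \in \Theta, \ (\theta^{(i)} - \theta^{(0)}) \neq 0, \ (\theta^{(i)} - \theta^{(0)})^\top (\theta^{(j)} - \theta^{(0)}) = 0, \ i \neq j \right\}.$$

Further denote by $\| \theta^{(i)} - \theta^{(0)} \|$ the Euclidean norm of $\theta^{(i)} - \theta^{(0)}$, and

$$\| \eta(\cdot, \theta^{(i)}) - \eta(\cdot, \theta^{(0)}) \|_\xi^2 = \sum_{x \in \mathcal{X}} [\eta(x, \theta^{(i)}) - \eta(x, \theta^{(0)})]^2\, \xi(x).$$

The “extended” criteria defined as

$$\phi_{eD}(\xi) = \min_{(\theta^{(1)}, \dots, \theta^{(p)}) \in \mathcal{V}_{\theta^{(0)}}} \frac{(1/p) \sum_{i=1}^{p} \| \eta(\cdot, \theta^{(i)}) - \eta(\cdot, \theta^{(0)}) \|_\xi^2}{\left[ \prod_{j=1}^{p} \| \theta^{(j)} - \theta^{(0)} \|^2 \right]^{1/p}},$$

$$\phi_{eA}(\xi) = \min_{(\theta^{(1)}, \dots, \theta^{(p)}) \in \mathcal{V}_{\theta^{(0)}}} \frac{\sum_{i=1}^{p} \| \theta^{(i)} - \theta^{(0)} \|^2\, \| \eta(\cdot, \theta^{(i)}) - \eta(\cdot, \theta^{(0)}) \|_\xi^2}{\left[ \sum_{j=1}^{p} \| \theta^{(j)} - \theta^{(0)} \|^2 \right]^2},$$

$$\phi_{eE_k}(\xi) = \min_{(\theta^{(1)}, \dots, \theta^{(p)}) \in \mathcal{V}_{\theta^{(0)}}} \sum_{i=1}^{k} \frac{\| \eta(\cdot, \theta^{(i)}) - \eta(\cdot, \theta^{(0)}) \|_\xi^2}{\| \theta^{(i)} - \theta^{(0)} \|^2}$$

coincide with those in Theorem 1 in case that the model is linear.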

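Proof. Consider first the expression for $\phi_D(\xi)$ in Theorem 1. Using the notation from Sec. 2, for every $\mu \in \Xi^+$ we can write $M^{-1}(\mu) = \sum_{i=1}^{p} \nu_i(\mu)\, \nu_i^\top(\mu)$ with $\nu_i(\mu) = u_i(\mu) / \sqrt{\lambda_i(\mu)}$ (the normed eigenvector divided by the square root of the eigenvalue), and $\| \nu_i(\mu) \|^2 = \lambda_i^{-1}(\mu)$. It follows that

$$\frac{\det^{1/p}[M(\mu)]}{p}\, f^\top(x)\, M^{-1}(\mu)\, f(x) = \frac{(1/p) \sum_{i=1}^{p} [f^\top(x)\, \nu_i(\mu)]^2}{\left[ \prod_{i=1}^{p} \| \nu_i(\mu) \|^2 \right]^{1/p}}.$$

Denote $\theta^{(i)}(\mu) = \theta^{(0)} + \nu_i(\mu)$. In the linear model $f^\top(x)\, \nu_i(\mu) = \eta(x, \theta^{(i)}(\mu)) - \eta(x, \theta^{(0)})$. So from Theorem 1 it follows that

$$\phi_D(\xi) = \min_{\mu \in \Xi^+} \frac{(1/p) \sum_{i=1}^{p} \| \eta(\cdot, \theta^{(i)}(\mu)) - \eta(\cdot, \theta^{(0)}) \|_\xi^2}{\left[ \prod_{j=1}^{p} \| \theta^{(j)}(\mu) - \theta^{(0)} \|^2 \right]^{1/p}}. \qquad (11)$$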


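Evidently $(\theta^{(1)}(\mu), \dots, \theta^{(p)}(\mu)) \in \mathcal{V}_{\theta^{(0)}}$. On the other hand, for any $(\theta^{(1)}, \dots, \theta^{(p)}) \in \mathcal{V}_{\theta^{(0)}}$ we define

$$B = \left[ \sum_{i=1}^{p} (\theta^{(i)} - \theta^{(0)}) (\theta^{(i)} - \theta^{(0)})^\top \right]^{-1}.$$

From Remark 1 of Theorem 1 it follows that we can take the minimum in (11) with respect to all $(\theta^{(1)}, \dots, \theta^{(p)}) \in \mathcal{V}_{\theta^{(0)}}$ and not with respect to all $\mu \in \Xi^+$.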

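We proceed similarly for A-optimality. We have $M^{-1}(\mu) = \sum_{i=1}^{p} \| \nu_i(\mu) \|^2\, u_i(\mu)\, u_i^\top(\mu)$ and $\mathrm{tr}[M^{-1}(\mu)] = \sum_{i=1}^{p} \lambda_i^{-1}(\mu)$, so

$$\frac{\| M^{-1}(\mu)\, f(x) \|^2}{\{\mathrm{tr}[M^{-1}(\mu)]\}^2} = \frac{\sum_{i=1}^{p} \| \nu_i(\mu) \|^4\, [f^\top(x)\, u_i(\mu)]^2}{\left[ \sum_{j=1}^{p} \| \nu_j(\mu) \|^2 \right]^2} = \frac{\sum_{i=1}^{p} \| \theta^{(i)}(\mu) - \theta^{(0)} \|^2\, [\eta(x, \theta^{(i)}(\mu)) - \eta(x, \theta^{(0)})]^2}{\left[ \sum_{j=1}^{p} \| \theta^{(j)}(\mu) - \theta^{(0)} \|^2 \right]^2}.$$

For the $E_k$-optimality criterion we write $P^{(k)}(\mu) = \sum_{i=1}^{k} \| \nu_i(\mu) \|^{-2}\, \nu_i(\mu)\, \nu_i^\top(\mu)$, hence

$$\| P^{(k)}(\mu)\, f(x) \|^2 = \sum_{i=1}^{k} \| \nu_i(\mu) \|^{-2}\, [f^\top(x)\, \nu_i(\mu)]^2 = \sum_{i=1}^{k} \frac{[\eta(x, \theta^{(i)}(\mu)) - \eta(x, \theta^{(0)})]^2}{\| \theta^{(i)}(\mu) - \theta^{(0)} \|^2}.$$

□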
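The identities used in this proof can be checked numerically in a linear model. The sketch below assumes quadratic regressors on a nine-point grid, random designs $\xi$ and $\mu$, and an arbitrary $\theta^{(0)}$; it builds $\theta^{(i)}(\mu) = \theta^{(0)} + \nu_i(\mu)$ and compares the extended D- and A-expressions with the corresponding sums from Theorem 1. Variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(-1.0, 1.0, 9)
F = np.column_stack([np.ones_like(grid), grid, grid**2])
p = F.shape[1]
xi = rng.dirichlet(np.ones(grid.size))
mu = rng.dirichlet(np.ones(grid.size))
theta0 = rng.standard_normal(p)

M_mu = F.T @ (F * mu[:, None])
lam, U = np.linalg.eigh(M_mu)
nu = U / np.sqrt(lam)                  # column i is nu_i = u_i / sqrt(lambda_i)
thetas = theta0[:, None] + nu          # column i is theta^(i)(mu)

# Linear model: eta(x, theta) = f(x)^T theta
diff_eta = F @ thetas - (F @ theta0)[:, None]
norm_eta = (diff_eta ** 2 * xi[:, None]).sum(axis=0)       # ||eta(., th_i) - eta(., th_0)||_xi^2
norm_th = ((thetas - theta0[:, None]) ** 2).sum(axis=0)    # ||th_i - th_0||^2

ext_D = norm_eta.mean() / np.prod(norm_th) ** (1.0 / p)
ext_A = (norm_th * norm_eta).sum() / norm_th.sum() ** 2

# Corresponding sums from Theorem 1 at the same mu
Minv = np.linalg.inv(M_mu)
thm1_D = np.linalg.det(M_mu) ** (1.0 / p) / p * float(np.sum((F @ Minv) * F, axis=1) @ xi)
thm1_A = float(np.sum((F @ Minv) ** 2, axis=1) @ xi) / np.trace(Minv) ** 2
print(ext_D, thm1_D, ext_A, thm1_A)
```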

Remark 3. The expressions in Theorem 3 are evidently linear in $\xi$, so the maximization of $\phi_{eD}(\xi)$, $\phi_{eA}(\xi)$, and $\phi_{eE_k}(\xi)$ with respect to $\xi$ corresponds to an “infinite-dimensional” LP problem even in a nonlinear model. However, this problem is too complex to be used for experimental design. Moreover, in contrast to the criteria considered in Pazman and Pronzato (2014), a clear statistical interpretation is still missing.